Rice (Oryza sativa L.) cultivated in the Central Highlands of Vietnam: Dataset on the endophytic microbiome

Rice (Oryza sativa L.) is the main annual crop cultivated in the Central Highlands region of Vietnam. Understanding the endophytic bacterial community of this plant can support the development of new techniques for sustainable production. In this work, a representative sample was obtained by combining rice (RVT variety) root samples collected from five different fields in Dray Sap Commune, Krong Ana District, Dak Lak Province, the Central Highlands of Vietnam. The 16S rRNA amplicon library was sequenced using Illumina MiSeq technology. QIIME2 with the SILVA SSURef reference database was used to analyze the taxonomic profile, and PICRUSt2 with the MetaCyc database was used to predict the functional profile of rice endophytic prokaryotes. Results revealed that Enterobacterales was the predominant order (57.7%) in the bacterial community, and biosynthesis was the primary function of the rice endophytic microbiome (75.95%). Raw sequences obtained in this work are available from the National Center for Biotechnology Information (NCBI) (BioProject ID: PRJNA994482) and Mendeley Data [1]. The data provide insight into the endophytic microbiome of rice (RVT variety) cultivated in the Central Highlands of Vietnam and are valuable for developing new methods for locally sustainable rice production that employ endophytic bacteria. This is the first report on the endophytic microbiome of rice cultivated in this region.

Dataset link: Root endophytic microbiome dataset of rice (Oryza sativa L.) grown in the Central Highlands of Vietnam (Original data)

Keywords:
Rice endophytic microbiome, Proteobacteria, Gammaproteobacteria, Biosynthesis

Value of the Data
• Data provide taxonomic and functional profiles of the endophytic microbiome of rice (RVT variety) grown in Vietnam's Central Highlands.
• Data can be helpful for comparing the endophytic microbiome of rice (RVT variety) grown in this region and others.
• Data can be useful for developing a new method for producing locally sustainable rice (RVT variety) employing endophytic bacteria.
Vietnam is one of the world's top producers and exporters of rice. The Central Highlands region is the country's center of coffee and black pepper production. Apart from coffee and black pepper, rice is the main annual crop grown in this region. According to a report, Vietnam planted 7,238,900 hectares of rice and produced 43,852,600 tons in 2021, of which the Central Highlands contributed 250,200 hectares and 1,466,300 tons, respectively [2]. Farmers in this area frequently use chemical fertilizers to boost rice yield. However, chemical fertilizers can contaminate groundwater, persist in the environment, and reduce soil fertility and soil microorganisms [3]. Employing beneficial microbes is considered a promising option for producing rice sustainably. Rhizospheric and endophytic microbiome data of coffee, black pepper, and sugarcane plants have been presented to support new techniques for the sustainable production of the primary crops in this region [4-7]; however, microbiome data for rice are still unavailable. This work aimed to establish a dataset of the endophytic bacteria of rice (RVT variety) grown in the Central Highlands of Vietnam using 16S rRNA metagenomics.
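For scale, the 2021 figures quoted above imply the regional share of national rice area and production, as well as the average yields. The short calculation below is only an illustrative sketch: the variable names are hypothetical, and the derived shares and yields are computed here rather than reported in the cited source.

```python
# 2021 figures as quoted in the text [2]
vietnam_area_ha = 7_238_900
vietnam_production_t = 43_852_600
highlands_area_ha = 250_200
highlands_production_t = 1_466_300

# Share of the Central Highlands in the national totals (percent)
area_share = 100 * highlands_area_ha / vietnam_area_ha
production_share = 100 * highlands_production_t / vietnam_production_t

# Implied average yields (tons per hectare)
national_yield = vietnam_production_t / vietnam_area_ha
highlands_yield = highlands_production_t / highlands_area_ha

print(f"Area share:        {area_share:.2f} %")
print(f"Production share:  {production_share:.2f} %")
print(f"National yield:    {national_yield:.2f} t/ha")
print(f"Highlands yield:   {highlands_yield:.2f} t/ha")
```

On these figures the region accounts for roughly 3.5% of the planted area and 3.3% of the production.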

Sampling
On October 30, 2021, five samples of rice roots of the RVT variety (each weighing 50-70 g) were taken from five different fields in Dray Sap Commune, Krong Ana District, Dak Lak Province. In each field, roots were collected from five distinct locations and combined into one sample. The five field samples were then mixed well and combined into one representative sample. The sample was stored in an ice box (4 °C) and transferred to the laboratory within 2 h. In the laboratory, the root surface was sterilized as previously described [7] to remove microorganisms adhering to it. The sterilized sample was then kept at −80 °C until analysis.

Genomic DNA Extraction, Library Preparation, and Sequencing
Metagenomic DNA of the rice endophytic microbiome was extracted from 300 mg of the root sample using the DNeasy PowerSoil Pro kit (Qiagen, Germany), following the manufacturer's instructions. The 16S rRNA genes of the extracted metagenomic DNA were amplified, and amplicon libraries were then prepared using the Swift amplicon 16S plus internal transcribed spacer panel kit (Swift Biosciences, USA). Finally, the amplicon library was sequenced on the Illumina MiSeq platform (2 × 150 bp, paired-end) [8].

Bioinformatic Analysis of Data
Bioinformatic analysis of the data was performed as previously described [8]. Briefly, raw basecall sequences were demultiplexed using Bcl2fastq. Adapters, primers, and low-quality sequences (average score of < 20 and read length of < 100 bp) were removed using Trimmomatic 0.39 and Cutadapt 2.10. Clustering and dereplication of reads into amplicon sequence variants were carried out using the q2-dada2 plugin of the QIIME2 pipeline 2020.8. Taxonomic analysis of the rice endophytic microbiome was performed using QIIME2 with the SILVA SSURef reference database. The functional profile of rice endophytic prokaryotes was predicted using PICRUSt2 2.3.0-b and the MetaCyc database.
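The quality criteria stated above (average score of at least 20 and read length of at least 100 bp) can be illustrated with a minimal Python sketch. This is not the Trimmomatic or Cutadapt implementation used in this work. It assumes Phred+33 quality encoding, and it assumes that a read failing either criterion is discarded. The function names and the toy reads are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record as (header, sequence, quality string)
FastqRecord = Tuple[str, str, str]

MIN_MEAN_QUALITY = 20  # threshold stated in the text
MIN_LENGTH = 100       # threshold stated in the text (bp)
PHRED_OFFSET = 33      # assumption: Phred+33 encoding


def mean_quality(quality: str) -> float:
    """Return the average Phred score of a read (0.0 for an empty read)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - PHRED_OFFSET for ch in quality) / len(quality)


def passes_filter(record: FastqRecord) -> bool:
    """Keep a read only if it meets both the length and quality thresholds."""
    _, sequence, quality = record
    return len(sequence) >= MIN_LENGTH and mean_quality(quality) >= MIN_MEAN_QUALITY


def filter_reads(records: Iterable[FastqRecord]) -> Iterator[FastqRecord]:
    """Yield only the reads that pass the filter."""
    return (record for record in records if passes_filter(record))


# Toy reads (hypothetical): 'I' encodes Q40 and '+' encodes Q10 under Phred+33
reads = [
    ("@good", "A" * 120, "I" * 120),   # long enough, high quality
    ("@short", "A" * 80, "I" * 80),    # shorter than 100 bp
    ("@lowq", "A" * 120, "+" * 120),   # mean quality below 20
]
kept = [header for header, _, _ in filter_reads(reads)]
print(kept)
```

In this toy input only the first read is retained; the other two fail the length and quality thresholds, respectively.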

Limitations
Not applicable.

Ethics Statement
The current work does not involve human subjects, animal experiments, or any data collected from social media platforms.

Fig. 1. Taxonomic profiles of the endophytic microbiome of rice cultivated in the Central Highlands, Vietnam.

Fig. 2. Functional profiles of the endophytic microbiome of rice cultivated in the Central Highlands, Vietnam.